SUMMARY

Schur-optimality is defined (in the general setting of a linear
model) as a generalization of the well-known D-, A- and
E-optimality criteria. Techniques to establish Schur-optimality
are outlined, based chiefly on a process of averaging information
matrices and on vector majorization. A design with a completely
symmetric information matrix of maximal trace is shown to be
Schur-optimal.

A design with an information matrix of maximal trace and exactly
two distinct nonzero eigenvalues is proved Schur-better than a
large class of designs. One description of a subcollection of
designs over which Schur-optimality holds is given only in terms
of the diagonal elements of the information matrices.
Consequences of this are then examined in the setting of block
designs.

1. INTRODUCTION

In experiments, interest often arises in estimating parameters
with equal (or nearly equal) precision. A design balanced as much
as possible for the parameters of interest is intuitively felt to
be the right choice, since no parameter ought to be left at a
disadvantage. But precision (formulated in terms of the covariance
matrix of the estimates) turns out to be related to the concept of
balance neither in a direct way nor in an obvious one. While
balance relates directly to the entries of the information matrix,
precision is closely connected to its spectrum. In the present
paper a connection between balance and precision is made with the
help of the concept of Schur-optimality.

Certain convex operations on the entries of the information matrix
(as trends toward balance) turn out to be compatible with the
Schur-convex functions defined on the spectrum of the information
matrix, as measures of precision. By compatibility we roughly mean
that the more balanced a design tends to be, the closer to a
minimum the values of the Schur-convex functions get. Our goal is
eventually to find a design for which these Schur-convex functions
reach their minima. As we define it, the concept of
Schur-optimality is very strong and a design can only seldom be
proved Schur-optimal over the collection of all possible designs.
It is however quite convenient to show that whenever a design with
many desirable symmetries exists, it is better (in a very strong
sense) than a large number of less symmetric designs. This is the
main scope of the paper.

A design $d$ describes the way a certain statistical experiment is
to be conducted. It generally specifies where the observations are
to be taken and in what proportion at each location. The design
is, within limits, subject to choice by the experimenter.

Let $\Omega$ be the collection of all possible designs. For
$d \in \Omega$ we assume the following expected linear response:

\[ E(Y) = X_d \theta \]

where $Y$ is the $m \times 1$ vector of uncorrelated observations
with common variance $\sigma^2$, $X_d$ is the design matrix of
dimension $m \times p$ and $\theta$ is a $p \times 1$ vector of
unknown parameters. Usually statistical interest arises in
estimating linear functions of $\theta_1$, a $v \times 1$
subvector of $\theta$. Then the reduced normal equations for
$\theta_1$ can be written as

\[ C_d \hat{\theta}_1 = Q_d' Y \]

with $Q_d$ an $m \times v$ matrix and $C_d$ a $v \times v$
nonnegative definite matrix, called the information matrix of the
design $d$ (for $\theta_1$). Unless otherwise specified, all the
information matrices $C_d$ in the sequel will be with reference to
the subvector $\theta_1$.

In the block design setting, for example, we are to compare $v$
varieties (labeled $1, 2, \ldots, v$) via $b$ blocks of size $k$
($< v$). A design $d$ in this case is a $k \times b$ array with
varieties as entries and blocks as columns. The collection of all
designs is denoted by $\Omega_{v,b,k}$. The usual additive model,
under which these designs are considered, specifies the
expectation of an observation on variety $i$ in block $j$ to be

\[ \alpha_i + \beta_j \]

where $\alpha_i$ is the (unknown) effect of variety $i$ and
$\beta_j$ is the (unknown) effect of the $j$th block. Let

\[ \theta = (\alpha_1, \ldots, \alpha_v, \beta_1, \ldots, \beta_b)' \]

be the vector of unknown parameters. We are interested only in the
subvector $\alpha = (\alpha_1, \ldots, \alpha_v)'$ of variety
effects. The information matrix for $\alpha$, when the design $d$
is used for estimation, is

\[ C_d = \mathrm{diag}(r_{d1}, \ldots, r_{dv}) - \frac{1}{k} N_d N_d' \]

where $r_{di}$ is the number of replications of variety $i$ in $d$
and $N_d = (n_{dij})$, with $n_{dij}$ signifying the number of
times variety $i$ occurs in block $j$. The above information
matrix is nonnegative definite with row sums zero, for all
$d \in \Omega_{v,b,k}$. The row sums being zero reflects the fact
that only linear contrasts (i.e. functions $\ell'\alpha$ with
$\ell' 1 = 0$, where $1 = (1, \ldots, 1)'$) are estimable under
any design $d$. This is very often the case in discrete settings.
The collection of information matrices $C_d$, with
$d \in \Omega_{v,b,k}$, has therefore a common kernel generated
by $1$.
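
The formula for the information matrix of a block design can be checked on a small case. The following is a minimal sketch in Python, assuming a hypothetical design with 3 varieties in 3 blocks of size 2; the function name `information_matrix` and the example blocks are illustrative choices, not part of the paper.

```python
from fractions import Fraction

def information_matrix(blocks, v):
    """Return C_d = diag(r_d1..r_dv) - (1/k) N_d N_d' for blocks of common size k."""
    k = len(blocks[0])
    b = len(blocks)
    # Incidence matrix N_d: n[i][j] = number of times variety i+1 occurs in block j.
    n = [[blk.count(i) for blk in blocks] for i in range(1, v + 1)]
    r = [sum(row) for row in n]  # replication numbers r_d1..r_dv
    c = [[Fraction(0)] * v for _ in range(v)]
    for i in range(v):
        for h in range(v):
            concurrence = sum(n[i][j] * n[h][j] for j in range(b))
            c[i][h] = (r[i] if i == h else 0) - Fraction(concurrence, k)
    return c

# Hypothetical example: every pair of the 3 varieties forms one block.
blocks = [(1, 2), (1, 3), (2, 3)]
C = information_matrix(blocks, 3)
row_sums = [sum(row) for row in C]  # each row sum should be zero
```

Exact rational arithmetic is used so that the zero row sums can be tested with equality rather than with a floating-point tolerance.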

Denote by $\mu(C_d)$ the nondecreasingly ordered vector of
eigenvalues of $C_d$ outside a possible common kernel (which is of
no help in distinguishing among designs). Let $\mu(C_d)$ be a
vector of length $n$ ($\le v$). In most relevant instances $n$ is
either $v$ or $v-1$. Designs $d$ for which $\mu(C_d)$ has some
zero entries are ruled out as bad designs, generally because they
are disconnected. We shall therefore focus our attention on
designs $d$ for which $\mu(C_d)$ has all its entries positive. In
summary, to a design $d \in \Omega$ we associate an information
matrix $C_d$ and the nondecreasingly ordered vector $\mu(C_d)$ of
its eigenvalues associated with the eigenvectors of $C_d$ outside
a kernel common to all $C_d$ with $d \in \Omega$, i.e.,

\[ \mu(C_d) = (\mu_{d1}, \ldots, \mu_{dn})', \quad 0 < \mu_{d1} \le \cdots \le \mu_{dn}. \]

Statistical optimality criteria (e.g. D-, A-, E-criteria) were
defined on the collection of vectors $\mu(C_d)$, with
$d \in \Omega$, by Ehrenfeld (1955) and Kiefer (1959). A design
$d^*$ is called D-, A- or E-optimal if it minimizes

\[ \prod_{i=1}^{v-1} \mu_{di}^{-1}, \quad \sum_{i=1}^{v-1} \mu_{di}^{-1} \quad \text{or} \quad \mu_{d1}^{-1}, \]

respectively, over all $d \in \Omega$. Minimization of functions
of the form $\sum_{i=1}^{v-1} f(\mu_{di})$ with $f$ convex has
been considered by Kiefer (1975) and Cheng (1978). While the D-
and A-criteria can be easily expressed in this latter form, the
E-criterion cannot; it can only be recovered as a limit of such
functions. In the first part of this paper we extend criteria of
the form $\sum_{i=1}^{v-1} f(\mu_{di})$ with $f$ convex to
$\phi(\mu(C_d))$ with $\phi$ Schur-convex and nonincreasing in its
arguments. The three functions associated with the D-, A- and
E-criteria are all genuine instances of Schur-convex functions
which are nonincreasing in their arguments. We then define the
concept of a Schur-optimal design and, in section 3, outline
general methods of establishing Schur-optimality. The techniques
which we mention rely chiefly on convexity (averaging information
matrices) and on vector majorization. In the last section of the
paper we illustrate the material presented in section 3 by showing
that a design with a completely symmetric information matrix of
maximal trace is Schur-optimal. This result very much resembles
Proposition 1 of Kiefer (1975) on Universal optimality. There is
no direct relationship between Schur-optimality and Universal
optimality, but the concepts are discussed comparatively in
section 4. We then show that a design with an information matrix
of maximal trace and exactly two distinct nonzero eigenvalues is
Schur-optimal over classes of designs which satisfy a verifiable
condition on the entries of their information matrices. For a
subclass of designs a sufficient condition is conveniently
formulated only in terms of the diagonal entries of the
information matrices. In the block design setting corollaries are
derived for binary designs with exactly two nonzero eigenvalues.
Examples of such designs are the Partially Balanced Incomplete
Block designs with two associate classes, as well as extended and
abridged versions of both Balanced Incomplete Block designs and
Group Divisible designs.

2. DEFINITIONS

Let us recall the following.

A square matrix is called doubly stochastic if it has nonnegative
entries with row and column sums equal to 1.

Let $I$ be an interval on the real line $R$. A function $\phi$
defined on $I^n$ with real values is called Schur-convex if

\[ \phi(Sx) \le \phi(x) \]

for all $x \in I^n$ and all $S$ doubly stochastic.

A real function $f$ defined on $I^n$ is said to be nonincreasing
in its arguments if it is a nonincreasing function when restricted
to each of its arguments.

A function $F$ defined on $I^n$ is called symmetric if
$F(Px) = F(x)$ for all $x \in I^n$ and all permutation matrices
$P$.

A function $\phi$ defined on a convex set $A$ in $R^n$ with real
values is called convex if

\[ \phi(\alpha x + (1-\alpha) y) \le \alpha \phi(x) + (1-\alpha) \phi(y) \]

for all $x, y \in A$ and all $0 \le \alpha \le 1$.

By observing that a permutation matrix (and its inverse) is doubly
stochastic one can see that a Schur-convex function is always
symmetric. The converse is of course false, but the well-known
result of Birkhoff and Von Neumann, which states that the
collection of doubly stochastic matrices is the convex span of
permutation matrices, provides us with many examples of
Schur-convex functions. By this result it immediately follows that
any symmetric and convex function is Schur-convex. Lastly, a
Schur-convex function need not be convex, e.g.,

\[ \phi(x_1, x_2) = |x_1 - x_2|^{1/2}. \]
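
The last claim can be probed numerically. The sketch below assumes the example function is $|x_1 - x_2|^{1/2}$ and uses the fact that every 2x2 doubly stochastic matrix has the form [[t, 1-t], [1-t, t]] with 0 <= t <= 1; the helper names `phi` and `apply_ds` are illustrative only. A random search of this kind is a sanity check, not a proof.

```python
import math
import random

def phi(x):
    """phi(x1, x2) = |x1 - x2| ** (1/2)."""
    return math.sqrt(abs(x[0] - x[1]))

def apply_ds(t, x):
    """Apply the 2x2 doubly stochastic matrix [[t, 1-t], [1-t, t]] to x."""
    return (t * x[0] + (1 - t) * x[1], (1 - t) * x[0] + t * x[1])

# Schur-convexity: phi(Sx) <= phi(x) on randomly drawn x and S.
random.seed(0)
schur_ok = True
for _ in range(1000):
    x = (random.uniform(0, 10), random.uniform(0, 10))
    t = random.random()
    if phi(apply_ds(t, x)) > phi(x) + 1e-12:
        schur_ok = False

# Convexity fails at the midpoint of x = (0, 0) and y = (4, 0).
x, y, mid = (0.0, 0.0), (4.0, 0.0), (2.0, 0.0)
convex_violated = phi(mid) > 0.5 * phi(x) + 0.5 * phi(y)
```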

Let $b$ be a constant such that $\mu_{di} \le b$ for all
$1 \le i \le n$ and all $d \in \Omega$, e.g., $b$ can be the
maximal trace of $C_d$ with $d \in \Omega$. The constant $b$ is
always finite, either because $\Omega$ is finite or because of
arguments involving compactness. Let $I = [0, b]$. Then
$\mu(C_d) \in I^n$ for all $d \in \Omega$. So we can define
$\phi(\mu(C_d))$ for $\phi$ Schur-convex.

For convenience we set by definition $\phi(C_d)$ to be
$\phi(\mu(C_d))$ for $\phi$ Schur-convex and nonincreasing in its
arguments. We are now ready to define Schur-optimality.

Definition 2.1. A design $d^*$ is said to be Schur-better than
another design $d$ (notation $d^* \succ d$) if

\[ \phi(C_{d^*}) \le \phi(C_d) \]

for all Schur-convex functions $\phi$ nonincreasing in their
arguments.

Definition 2.2. A design $d^*$ in $\Omega$ is called Schur-optimal
over $\Omega$ if

\[ \phi(C_{d^*}) \le \phi(C_d) \]

for all Schur-convex functions $\phi$ nonincreasing in their
arguments and all designs $d$ in $\Omega$.

Letting $\phi(C_d)$ be

\[ -\log \prod_{i=1}^{n} \mu_{di}, \quad \sum_{i=1}^{n} \mu_{di}^{-1}, \quad -\mu_{d1}, \]

we obtain the well-known functions associated with the D-, A- and
E-optimality criteria, respectively. The above functions are
Schur-convex because they all are symmetric and convex. The
functions $\phi(C_d) = \sum_{i=1}^{n} f(\mu_{di})$ with $f$ convex
nonincreasing and the criteria defined by Kiefer (1974) are also
Schur-convex for the same reason. It is clear that all these
functions are nonincreasing in their arguments. Note that the
E-criterion is not a limiting case when formulated in terms of
Schur-convexity.
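
To make the three criteria concrete, here is a small Python sketch that evaluates them on eigenvalue vectors. It assumes the D-, A- and E-functions are -log of the product, the sum of reciprocals, and minus the smallest eigenvalue, respectively; the two eigenvalue vectors are hypothetical and merely share the same trace.

```python
import math

def d_criterion(mu):
    """-log of the product of the eigenvalues."""
    return -sum(math.log(m) for m in mu)

def a_criterion(mu):
    """Sum of the reciprocal eigenvalues."""
    return sum(1.0 / m for m in mu)

def e_criterion(mu):
    """Minus the smallest eigenvalue."""
    return -min(mu)

# Hypothetical spectra with equal trace: one balanced, one spread out.
balanced = [2.0, 2.0, 2.0]
spread = [1.0, 2.0, 3.0]
criteria = (d_criterion, a_criterion, e_criterion)
balanced_wins = all(f(balanced) < f(spread) for f in criteria)
```

All three functions are smaller on the balanced vector, in line with the idea that Schur-optimality implies the three classical criteria at once.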

Schur-optimality is a very strong criterion, as it implies all of
the above. As the above examples indicate, it is probably quite
satisfactory to look at just symmetric and convex functions of
eigenvalues, rather than Schur-convex ones. But the techniques
that we shall outline readily apply to Schur-convex functions and
this motivates the extension.

3. ON AVERAGING AND MAJORIZATION

The principal tool that we shall employ when searching for
Schur-optimal designs is contained in Theorem 3.2. This theorem
relies, in turn, on a fundamental result on majorization due to
Hardy, Littlewood and Polya (1934), which was later extended by
Ostrowski (1952).

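Whenever we write $x \prec y$ for two vectors
$x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ in
$I^n$ we assume that

\[ x_1 \le x_2 \le \cdots \le x_n, \quad y_1 \le y_2 \le \cdots \le y_n \]

and that

\[ \sum_{i=1}^{m} y_i \le \sum_{i=1}^{m} x_i \quad \text{for all } 1 \le m \le n. \]

If $x \prec y$ we say that $y$ majorizes $x$.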
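
The ordering just defined is straightforward to test by comparing partial sums of the sorted vectors. The following Python sketch is one way to do so; the function name `is_majorized` and the tolerance argument are assumptions made for illustration.

```python
def is_majorized(x, y, tol=1e-12):
    """Return True if y majorizes x in the sense used here.

    Both vectors are sorted nondecreasingly, and every partial sum of the
    m smallest entries of y must not exceed the corresponding sum for x.
    """
    xs, ys = sorted(x), sorted(y)
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    sum_x = sum_y = 0.0
    for xi, yi in zip(xs, ys):
        sum_x += xi
        sum_y += yi
        if sum_y > sum_x + tol:
            return False
    return True

flat_below_spread = is_majorized([2, 2, 2], [1, 2, 3])  # partial sums 1<=2, 3<=4, 6<=6
spread_below_flat = is_majorized([1, 2, 3], [2, 2, 2])  # fails at m = 1 since 2 > 1
```

Note that the total sums are not required to be equal, so this is a weak form of majorization.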

The useful concept of majorization has been applied in the context
of design optimality by Cheng (1979) to block designs with 4
varieties and an arbitrary number of blocks.

The following result can be easily derived from Ostrowski (1952):

Theorem 3.1. Let $x, y \in I^n$. If $x \prec y$ then
$\phi(x) \le \phi(y)$ for all Schur-convex functions $\phi$
nonincreasing in their arguments.

Denote by $\sigma(A)$ the nondecreasingly ordered vector of
eigenvalues of the matrix $A$. The result we state next is
essentially due to Ky Fan (1951), although he did not formulate it
in this form.

Lemma 3.1. Let $A_i$ ($1 \le i \le m$) be nonnegative definite
matrices. Then

\[ \sigma\Big(\sum_i \alpha_i A_i\Big) \prec \sum_i \alpha_i \sigma(A_i) \]

where $0 \le \alpha_i \le 1$ and $\sum_i \alpha_i = 1$.

Before we proceed, we need to introduce some notation. For a
$v \times v$ matrix $A$ and a permutation $\sigma$ on the symbols
$1, 2, \ldots, v$ we denote by $A^{\sigma}$ the matrix obtained
from $A$ after performing the row and (same) column permutations
as indicated by $\sigma$. That is, $A^{\sigma} = PAP'$, where $P$
is the $v \times v$ matrix representation of $\sigma$. Since $A$
and $A^{\sigma}$ are similar matrices, their ordered spectra
$\sigma(A)$ and $\sigma(A^{\sigma})$ are identical for any
permutation $\sigma$. We say that $A^{\sigma}$ is a conjugate
of $A$.

The following is an immediate but useful consequence (see also
Magda (1979), Lemma 3.2.1).

Proposition 3.1. Any convex combination
$\sum_i \alpha_i A^{\sigma_i}$ of conjugates of a nonnegative
definite matrix $A$ satisfies

\[ \sigma\Big(\sum_i \alpha_i A^{\sigma_i}\Big) \prec \sigma(A). \]

Now we can state and prove

Theorem 3.2. A design $d^* \in \Omega$ is Schur-better than $d$
($d^* \succ d$) if

\[ \sigma(C_{d^*}) \prec \sigma\Big(\sum_i \alpha_i C_d^{\sigma_i}\Big) \]

for some convex combination of conjugates of $C_d$.

Proof. Firstly, observe that by Theorem 3.1
$\mu(C_{d^*}) \prec \mu(C_d)$ implies $d^* \succ d$. Since the
last $n$ components of $\sigma(C_d)$ are the components of
$\mu(C_d)$ and the first $v-n$ components of $\sigma(C_d)$ are
zero (for all $d \in \Omega$), $\mu(C_{d^*}) \prec \mu(C_d)$ is
equivalent to $\sigma(C_{d^*}) \prec \sigma(C_d)$. So
$\sigma(C_{d^*}) \prec \sigma(C_d)$ implies $d^* \succ d$. Using
the assumption and then Proposition 3.1 we have

\[ \sigma(C_{d^*}) \prec \sigma\Big(\sum_i \alpha_i C_d^{\sigma_i}\Big) \prec \sum_i \alpha_i \sigma(C_d^{\sigma_i}) = \sigma(C_d). \]

The last equality is true because $C_d$ and $C_d^{\sigma_i}$ are
conjugates and hence have the same spectrum. This concludes the
proof.

Theorem 3.2 is helpful in the following way. Suppose that a design
$d^* \in \Omega$ with a lot of symmetries is believed optimal in
some broad sense for the intuitive reasons mentioned in the
introduction. It would be very satisfactory to show that for a
large class of designs $d \in \Omega$, $d^*$ is Schur-better than
$d$ (i.e. $d^* \succ d$). Because of its balance $d^*$ has an
information matrix for which $\sigma(C_{d^*})$ can be computed.
But $\sigma(C_d)$ for an arbitrary $d$ is generally impossible to
calculate in closed form. It is very often possible, however, to
compute $\sigma(\sum_i \alpha_i C_d^{\sigma_i})$ for a convex
combination of certain conjugates of $C_d$. The entries in the
convex combination tend to even out and this generally facilitates
the computation of the spectrum. The spectrum of the convex
combination is a helpful intermediary between $\sigma(C_{d^*})$
and $\sigma(C_d)$, and this is where Theorem 3.2 becomes of
assistance. We shall illustrate this in the next section.

Of particular importance is the average

\[ \bar{C}_d = \frac{1}{m} \sum_{i=1}^{m} C_d^{\sigma_i} \]

which we call an averaged version of $C_d$. We will be making
extensive use of averaged versions in the next section and find it
convenient to rely on the following:

Theorem 3.3. A design $d^* \in \Omega$ is Schur-better than $d$ if
$\sigma(C_{d^*}) \prec \sigma(\bar{C}_d)$ for some averaged
version $\bar{C}_d$ of $C_d$.

Clearly $d^*$ is Schur-optimal over $\Omega$ if it is Schur-better
than all the designs $d$ in $\Omega$.
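
The evening-out effect of averaging can be seen on a small matrix. The sketch below, in Python with exact fractions, averages a hypothetical 3x3 nonnegative definite matrix with zero row sums over all six of its conjugates; the helper names `conjugate` and `averaged_version` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

def conjugate(a, perm):
    """Apply the same permutation to the rows and the columns of a."""
    v = len(a)
    return [[a[perm[i]][perm[j]] for j in range(v)] for i in range(v)]

def averaged_version(a, perms):
    """Equal-weight average of the conjugates of a indexed by perms."""
    perms = list(perms)
    v = len(a)
    total = [[Fraction(0)] * v for _ in range(v)]
    for p in perms:
        c = conjugate(a, p)
        for i in range(v):
            for j in range(v):
                total[i][j] += c[i][j]
    return [[t / len(perms) for t in row] for row in total]

# Hypothetical matrix with zero row sums; its eigenvalues are 0, 1 and 3.
A = [[Fraction(2), Fraction(-1), Fraction(-1)],
     [Fraction(-1), Fraction(1), Fraction(0)],
     [Fraction(-1), Fraction(0), Fraction(1)]]
A_bar = averaged_version(A, permutations(range(3)))
# A_bar is completely symmetric: diagonal 4/3, off-diagonal -2/3.
```

Here the averaged matrix has the nonzero eigenvalue 4/3 + 2/3 = 2 with multiplicity 2, so its spectrum (2, 2) is majorized by the original (1, 3), as Proposition 3.1 predicts.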


4. RESULTS ON SCHUR-OPTIMALITY


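Let $d^* \in \Omega$ be a design for which $\mu(C_{d^*})$ has all
its entries equal to $\mu_{d^*}$. Assume also that the trace of
$C_d$ is at most equal to the trace of $C_{d^*}$, for all
$d \in \Omega$. We claim that $\mu(C_{d^*}) \prec \mu(C_d)$. Let
$\mu(C_d) = (\mu_{d1}, \ldots, \mu_{dn})'$. If

\[ \sum_{i=1}^{k} \mu_{di} > k \mu_{d^*} \]

for some $k \le n$, then $\mu_{dk}$, the largest of the first $k$
entries, exceeds $\mu_{d^*}$, so $\mu_{di} > \mu_{d^*}$ for all
$i \ge k+1$ and hence also

\[ \sum_{i=k+1}^{n} \mu_{di} \ge (n-k) \mu_{d^*}. \]

This implies that

\[ \mathrm{tr}\, C_d = \sum_{i=1}^{k} \mu_{di} + \sum_{i=k+1}^{n} \mu_{di} > k \mu_{d^*} + (n-k) \mu_{d^*} = n \mu_{d^*} = \mathrm{tr}\, C_{d^*}, \]

a contradiction. We therefore have $\mu(C_{d^*}) \prec \mu(C_d)$.
Theorem 3.1 now gives

Theorem 4.1. If there exists a design $d^*$ in $\Omega$ such that
$\mu(C_{d^*})$ has all its entries equal and
trace $C_{d^*} \ge$ trace $C_d$ for all $d \in \Omega$, then $d^*$
is Schur-optimal over $\Omega$.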


There are two particularly useful consequences. Before we state
them, let us call a matrix completely symmetric if all its
diagonal entries are equal and all its off-diagonal entries are
also equal. The designs in the following two propositions readily
satisfy the assumptions of Theorem 4.1.

Proposition 4.1. Let $\Omega$ consist of designs $d$ for which
$C_d$ has zero row sums. Then a design $d^* \in \Omega$ with a
completely symmetric information matrix of maximal trace is
Schur-optimal over $\Omega$.

Proposition 4.2. If $d^* \in \Omega$ is a design whose information
matrix is a multiple of the identity matrix and has maximal trace,
then $d^*$ is Schur-optimal over $\Omega$.
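
As a worked illustration of Proposition 4.1, consider a Balanced Incomplete Block design $d^*$ in the block design setting. The computation below is a standard one and is not taken from the text; it assumes the usual parameters $v, b, r, k, \lambda$ of such a design (here $r$ is the common replication number), for which $N_{d^*} N_{d^*}' = (r - \lambda) I + \lambda J$, $vr = bk$ and $r(k-1) = \lambda(v-1)$.

```latex
\begin{align*}
C_{d^*} &= rI - \frac{1}{k}\bigl[(r-\lambda) I + \lambda J\bigr]
         = \frac{r(k-1) + \lambda}{k}\, I - \frac{\lambda}{k}\, J
         = \frac{\lambda v}{k}\Bigl(I - \frac{1}{v} J\Bigr), \\
\mu(C_{d^*}) &= \Bigl(\frac{\lambda v}{k}, \ldots, \frac{\lambda v}{k}\Bigr)'
         \quad (v-1 \text{ entries}), \\
\operatorname{tr} C_{d^*} &= (v-1)\,\frac{\lambda v}{k} = \frac{r(k-1)v}{k} = b(k-1).
\end{align*}
```

Since the trace of $C_d$ equals $bk - \frac{1}{k}\sum_{i,j} n_{dij}^2 \le b(k-1)$ for every design with $b$ blocks of size $k$, with equality for binary designs, the matrix above is completely symmetric with maximal trace.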

The above propositions are reformulations of Proposition 1 and
Proposition 1' of Kiefer (1975) in terms of Schur-optimality.
Kiefer phrased the aforementioned results in terms of
Universal-optimality, a concept that proves to be very valuable
especially when dealing with optimality in regular settings which
permit symmetric designs. The essential difference between
Universal-optimality and Schur-optimality lies in the relaxation
of monotonicity in a scalar to that in each individual component
of $\mu(C_d)$. This permits immediate connections to the results
on hermitian matrices by Ky Fan and results on majorization by
Hardy, Littlewood and Polya. Schur-optimality is applicable in
less regular settings, especially when showing that a design with
desirable symmetries is Schur-better than large classes of less
symmetric designs. We illustrate this next.

Throughout the remainder of the paper, $\Omega$ is assumed to
consist of designs $d$ for which the information matrix $C_d$ has
row sums zero. Moreover, $\mu(C_d)$ will be a vector with $v-1$
components, as is the case in most discrete settings.

For an $m \times m$ matrix $C = (c_{ij})$, let us denote by
$\Delta(C)$ the quantity

\[ \Delta(C) = (m-1) \sum_{i=1}^{m} c_{ii} - \sum_{i \ne j} c_{ij}. \]

We can now state and prove

Theorem 4.2. If $\Omega$ contains a design $d^*$ such that
$\mu(C_{d^*})$ has exactly two distinct components,
$\mu_{d^*1} < \mu_{d^*2}$, with $\mu_{d^*1}$ of multiplicity $r$
and $\mu_{d^*2}$ of multiplicity $s$ ($r+s = v-1$), then $d^*$ is
Schur-optimal over the collection of designs $d$ in $\Omega$ which
satisfy Trace $C_d \le$ Trace $C_{d^*}$ and either (a) or (b)
below:

(a) $\Delta(M_d) \le r(r+1) \mu_{d^*1}$ for some
$(r+1) \times (r+1)$ principal minor $M_d$ of $C_d$;

(b) $\Delta(M_d) \ge s(s+1) \mu_{d^*2}$ for some
$(s+1) \times (s+1)$ principal minor $M_d$ of $C_d$.

Proof: Let $d \in \Omega$ satisfy (a). Write the information
matrix $C_d$ in such a way that $M_d$ is in the upper left hand
corner. Average $C_d$ over the first $r+1$ rows and columns and
then over the remaining $v-r-1$ rows and columns. Let $\bar{C}_d$
be the averaged version of $C_d$ so obtained. Explicitly

\[ \bar{C}_d = \frac{1}{(r+1)!\,(v-r-1)!} \sum_i C_d^{\sigma_i} \]

where $\sigma_i$ is a product of a permutation on the first $r+1$
rows and columns of $C_d$ with a permutation on the last $v-r-1$
rows and columns. The shape of $\bar{C}_d$ is

\[ \bar{C}_d = \begin{pmatrix} (a_d + \alpha_d) I - \alpha_d J & -\beta_d J \\ -\beta_d J & (b_d + \gamma_d) I - \gamma_d J \end{pmatrix} \]

where the upper left hand corner is $(r+1) \times (r+1)$ and $I$
and $J$ are respectively the identity matrix and the matrix with
all its entries 1. By adding the $v \times v$ matrix $\beta_d J$
to $\bar{C}_d$, the eigenvalues of $\bar{C}_d$ are found to be
$0$, $a_d + \alpha_d$, $v\beta_d$ and $b_d + \gamma_d$ of
multiplicities $1$, $r$, $1$ and $v-r-2$, respectively. By
Proposition 3.1 all these eigenvalues are nonnegative. Next we
show that $\mu(C_{d^*}) \prec \mu(\bar{C}_d)$. To achieve this let
us denote by $\nu(\bar{C}_d)$ the vector whose first $r$ entries
are equal to the average of the first $r$ entries of
$\mu(\bar{C}_d)$, and whose last $s$ entries are equal to the
average of the last $s$ entries of $\mu(\bar{C}_d)$. It is easy to
see that $\nu(\bar{C}_d) \prec \mu(\bar{C}_d)$. Since
$\Delta(M_d) = r(r+1)(a_d + \alpha_d)$ and since $d$ satisfies (a)
we have $a_d + \alpha_d \le \mu_{d^*1}$. This implies that each of
the first $r$ entries of $\nu(\bar{C}_d)$ is at most
$\mu_{d^*1}$. Since Trace $\bar{C}_d \le$ Trace $C_{d^*}$, it now
follows without much difficulty that
$\mu(C_{d^*}) \prec \nu(\bar{C}_d)$. This shows that
$\mu(C_{d^*}) \prec \mu(\bar{C}_d)$ and hence also that
$\sigma(C_{d^*}) \prec \sigma(\bar{C}_d)$. By Theorem 3.3 we now
conclude that $d^*$ is Schur-better than $d$. When $d \in \Omega$
satisfies (b) a similar averaging yields $d^* \succ d$. This
concludes the proof of the theorem.
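
The spectrum of the two-block averaged version used in the proof can be computed directly from block means. The Python sketch below assumes the parametrization in which the leading block has diagonal a and off-diagonal -alpha, the off-diagonal blocks have entries -beta, and the trailing block has diagonal b and off-diagonal -gamma; the function names and the 4x4 test matrix are hypothetical.

```python
from fractions import Fraction

def delta(c):
    """(m-1) times the sum of the diagonal minus the sum of the off-diagonal entries."""
    m = len(c)
    diag = sum(c[i][i] for i in range(m))
    off = sum(c[i][j] for i in range(m) for j in range(m) if i != j)
    return (m - 1) * diag - off

def averaged_spectrum(c, r):
    """Eigenvalues (outside the kernel) of the two-block averaged version of c.

    Assumes c is symmetric with zero row sums and 1 <= r <= v - 3.
    """
    v = len(c)
    first, last = range(r + 1), range(r + 1, v)
    m1, m2 = r + 1, v - r - 1
    a = Fraction(sum(c[i][i] for i in first), m1)
    alpha = -Fraction(sum(c[i][j] for i in first for j in first if i != j), m1 * (m1 - 1))
    beta = -Fraction(sum(c[i][j] for i in first for j in last), m1 * m2)
    b = Fraction(sum(c[i][i] for i in last), m2)
    gamma = -Fraction(sum(c[i][j] for i in last for j in last if i != j), m2 * (m2 - 1))
    return sorted([a + alpha] * r + [v * beta] + [b + gamma] * (m2 - 1))

# Hypothetical zero-row-sum matrix (v = 4), averaged with r = 1.
C = [[1, -1, 0, 0],
     [-1, 2, -1, 0],
     [0, -1, 2, -1],
     [0, 0, -1, 1]]
mu_bar = averaged_spectrum(C, 1)
M = [row[:2] for row in C[:2]]  # leading 2x2 principal submatrix
```

In this example the identity Delta(M_d) = r(r+1)(a_d + alpha_d) reads 5 = 2 * 5/2, and the eigenvalues 1, 5/2, 5/2 sum to the trace of the original matrix.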


One important remark is in order here. Theorem 4.2 does not impose
any condition on having $\mu_{d^*1}$ and $\mu_{d^*2}$ close
together. It is easy to see, however, that when they are close to
each other more designs satisfy (a) or (b) in the theorem and
hence $d^*$ is Schur-optimal over a larger collection of designs.
A convenient way to ensure the closeness of $\mu_{d^*1}$ and
$\mu_{d^*2}$ is to demand that
Trace $(C_{d^*}^2) \le$ Trace $(C_d^2)$ for all $d \in \Omega$.
Whenever such a design exists, Theorem 4.2 ensures its
Schur-optimality over large subfamilies of designs in $\Omega$.
This is a helpful fact, as it eases the search for the D-, A- and
E-optimal designs in $\Omega$.

Another important observation relates to the selection of a
principal minor $M_d$ for which (b) holds. Since $C_d$ is
nonnegative definite the diagonal elements of $C_d$ are relatively
large. Moreover, the diagonal elements of $M_d$ carry a lot of
weight (each one is multiplied by $s$) in $\Delta(M_d)$. The
off-diagonals of $M_d$ are therefore of much less importance when
it comes to maximizing $\Delta(M_d)$ over $M_d$. Hence it makes
sense to first choose the principal minor $M_d$ of maximal trace
and check if condition (b) in the above theorem is satisfied for
this particular principal minor.

There is one special case of Theorem 4.2 that deserves mention. It
is the setting in which all the designs in $\Omega$ have
information matrices with nonpositive off-diagonals. Such is the
case in the setting of block designs or two way elimination, for
example.

Denote by $c_{dij}$ the entries of the information matrix $C_d$,
arranged in such a way that
$c_{d11} \le c_{d22} \le \cdots \le c_{dvv}$.

Theorem 4.3. Let $\Omega$ consist of designs whose information
matrices have zero row sums and nonpositive off-diagonals. If
$\Omega$ contains a design $d^*$ which satisfies the assumptions
of Theorem 4.2 then $d^*$ is Schur-optimal over the class of
designs $d$ in $\Omega$ which satisfy either

\[ \sum_{i=1}^{r+1} c_{dii} \le r \mu_{d^*1} \quad \text{or} \quad \sum_{i=v-s}^{v} c_{dii} \ge (s+1) \mu_{d^*2}. \]

Proof: Since $c_{dij} \le 0$ for $i \ne j$, and the row sums of
$C_d$ are zero, we have

\[ \Delta(M_d) = r \sum_{i=1}^{r+1} c_{dii} - \sum_{i \ne j} c_{dij} \le r \sum_{i=1}^{r+1} c_{dii} + \sum_{i=1}^{r+1} c_{dii} = (r+1) \sum_{i=1}^{r+1} c_{dii} \le r(r+1) \mu_{d^*1} \]

where $M_d$ is the $(r+1) \times (r+1)$ principal minor of $C_d$
whose diagonal entries are $c_{dii}$, $i = 1, 2, \ldots, r+1$, and
the sum over $i \ne j$ runs over the entries of $M_d$. We thus
satisfy condition (a) in Theorem 4.2, and hence $d^* \succ d$. By
letting $M_d$ be the $(s+1) \times (s+1)$ principal minor in the
lower right hand corner of $C_d$ we have (by simply using the fact
that $c_{dij} \le 0$ for $i \ne j$)

\[ \Delta(M_d) \ge s \sum_{i=v-s}^{v} c_{dii} \ge s(s+1) \mu_{d^*2}. \]

We are now done by (b) of Theorem 4.2. This ends the proof.


Theorems 4.2 and 4.3 have a number of consequences when considered
under specific linear models. We shall examine next a corollary in
the block design setting.

Order the replication numbers in a design $d \in \Omega_{v,b,k}$
such that $r_{d1} \le r_{d2} \le \cdots \le r_{dv}$. By observing
that the $i$th diagonal entry of the information matrix of a
binary design $d$ is $\frac{k-1}{k} r_{di}$, one has the following
reformulation of Theorem 4.2 in terms of the replication numbers
$r_{di}$.

Corollary 4.1. If $d^*$ is a binary design in $\Omega_{v,b,k}$
whose information matrix has exactly two distinct nonzero
eigenvalues, $\mu_{d^*1} < \mu_{d^*2}$, the former being of
multiplicity $r$ and the latter of multiplicity $s$
($r+s = v-1$), then $d^*$ is Schur-better than all the binary
designs $d$ which satisfy either

\[ \sum_{i=1}^{r+1} r_{di} \le \frac{k}{k-1}\, r \mu_{d^*1} \quad \text{or} \quad \sum_{i=v-s}^{v} r_{di} \ge \frac{k}{k-1}\, (s+1) \mu_{d^*2}. \]
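
The replication-number conditions are easy to check mechanically. The following Python sketch assumes they take the form obtained from Theorem 4.3 with diagonal entries (k-1)/k times the replication numbers; the function name `corollary_condition` and the numerical example are hypothetical. The example uses a Group Divisible design with v = 4, k = 2, blocks (1,3), (1,4), (2,3), (2,4), whose information matrix has nonzero eigenvalues 1 (multiplicity r = 2) and 2 (multiplicity s = 1).

```python
from fractions import Fraction

def corollary_condition(reps, k, r, s, mu1, mu2):
    """Check whether the replication numbers of a binary design satisfy
    either of the two sufficient conditions (r, s are the multiplicities)."""
    reps = sorted(reps)
    v = len(reps)
    if r + s != v - 1:
        raise ValueError("multiplicities must satisfy r + s = v - 1")
    factor = Fraction(k, k - 1)
    low = sum(reps[: r + 1]) <= factor * r * mu1         # r+1 smallest replications
    high = sum(reps[v - s - 1:]) >= factor * (s + 1) * mu2  # s+1 largest replications
    return low or high

# Hypothetical competitors with b = 4 blocks of size k = 2 on v = 4 varieties.
unequal = corollary_condition([1, 1, 2, 4], k=2, r=2, s=1, mu1=1, mu2=2)
equal = corollary_condition([2, 2, 2, 2], k=2, r=2, s=1, mu1=1, mu2=2)
```

A design with replications (1, 1, 2, 4) satisfies the first condition (4 <= 4), so the corollary applies to it, while an equireplicate competitor satisfies neither condition and the corollary is silent about it.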

Designs $d^*$ which satisfy the assumptions of Theorem 4.2 exist
in many settings. Among block designs we mention the Partially
Balanced Incomplete Block designs with two associate classes,
extended and abridged Balanced Incomplete Block designs with any
number of disjoint binary blocks, and Group Divisible designs with
$\lambda_2 = \lambda_1 + 1$ adjoined or abridged by disjoint
binary blocks compatible with the partition of the groups (see
Constantine (1980)). When a pair of varieties occurs in either
$\lambda$ or $\lambda + 1$ blocks (which ensures the closeness of
$\mu_{d^*1}$ and $\mu_{d^*2}$), Theorem 4.2 asserts that all these
designs are Schur-optimal over large classes of designs. We now
examine one such instance more closely.

Let $d^* \in \Omega_{v,b,k}$ be a Group Divisible design with
$\lambda_2 = \lambda_1 + 1$ (where $\lambda_1$ is the number of
blocks containing two varieties that are in the same group) and
$m$ groups of size $n$. Examples show that $d^*$ is not
Schur-optimal (but very likely both D- and A-optimal) over all
designs. For $m = 2$ a very strong optimality statement (including
D- and A-) was proved by Cheng (1978); the E-optimality has been
obtained by Takeuchi (1961) for general $m$.

For a design $d \in \Omega_{v,b,k}$ let
$N_d N_d' = (\lambda_{dij})$. Let furthermore
$(M_t)_{t=1,\ldots,m}$ be an arbitrary partition of the $v$
varieties in $m$ groups of size $n$ each and set

\[ \bar{\lambda}_d = \frac{1}{v(n-1)} \sum_{t=1}^{m} \sum_{\substack{i \ne j \\ i, j \in M_t}} \lambda_{dij}, \qquad \bar{r}_d = \frac{1}{v} \sum_{i=1}^{v} (k r_{di} - \lambda_{dii}). \]

It is not very hard to show that the matrix $k\bar{C}_d$ having
$m$ diagonal blocks of size $n$, all equal to
$(\bar{r}_d + \bar{\lambda}_d) I - \bar{\lambda}_d J$, and with
all the entries outside these blocks equal, is an averaged version
of $kC_d$. One can show (see Magda (1979)) that
$\bar{r}_d + \bar{\lambda}_d$ is an eigenvalue of $k\bar{C}_d$ and
has multiplicity $v-m$. Since $d^*$ is binary, we have
$\bar{r}_d \le (k-1) r$ where $r$ is the replication of any
variety in $d^*$. Now, if $\bar{\lambda}_d \le \lambda_1$ we have
$\bar{r}_d + \bar{\lambda}_d \le (k-1) r + \lambda_1$. But
$(k-1) r + \lambda_1$ is the smallest nonzero eigenvalue of
$kC_{d^*}$ and it has multiplicity $v-m$. With the assumption that
$\bar{\lambda}_d \le \lambda_1$ and using the fact that
$kC_{d^*}$ has maximal trace, it can be now readily verified that
$\sigma(C_{d^*}) \prec \sigma(\bar{C}_d)$. By Theorem 3.3 we
therefore have:

Theorem 4.4. A Group Divisible design $d^* \in \Omega_{v,b,k}$
with $m$ groups of size $n$ and $\lambda_2 = \lambda_1 + 1$ is
Schur-optimal over the class of designs $d$ which satisfy
$\bar{\lambda}_d \le \lambda_1$ for $\bar{\lambda}_d$ associated
with some partition of the varieties.
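
The quantity $\bar{\lambda}_d$ depends on the chosen partition, which the following Python sketch illustrates. It assumes $\bar{\lambda}_d$ is the sum of the within-group off-diagonal concurrences divided by v(n-1); the function names, the design and the two partitions are hypothetical.

```python
from fractions import Fraction

def concurrence(blocks, v):
    """Return the v x v matrix N_d N_d' for varieties labeled 1..v."""
    lam = [[0] * v for _ in range(v)]
    for blk in blocks:
        counts = [blk.count(i) for i in range(1, v + 1)]
        for i in range(v):
            for j in range(v):
                lam[i][j] += counts[i] * counts[j]
    return lam

def lambda_bar(blocks, groups, v):
    """Average within-group concurrence for a partition into equal-size groups."""
    n = len(groups[0])
    lam = concurrence(blocks, v)
    total = sum(lam[i - 1][j - 1] for g in groups for i in g for j in g if i != j)
    return Fraction(total, v * (n - 1))

# Hypothetical design with v = 4 varieties and b = 4 blocks of size k = 2.
d = [(1, 2), (3, 4), (1, 3), (2, 4)]
bar_first = lambda_bar(d, [(1, 2), (3, 4)], 4)   # groups contain concurrent pairs
bar_second = lambda_bar(d, [(1, 4), (2, 3)], 4)  # groups contain no concurrent pair
```

With the first partition the average is 1, while the second gives 0, so a condition of the form "at most $\lambda_1 = 0$" holds for some partition even though it fails for another.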